Design of Departmental Metacomputing ML

Author

  • Frédéric Gava
Abstract

Bulk Synchronous Parallel ML, or BSML, is a functional data-parallel language for programming bulk synchronous parallel (BSP) algorithms. The execution time can be estimated, and deadlocks and indeterminism are avoided. For large scale applications, more than one parallel machine is needed. We consider here the design and cost model of a BSML-like language devoted to the programming of such applications: Departmental Metacomputing ML, or DMML.

Introduction. Bulk Synchronous Parallel ML (BSML) is an extension of ML for programming Bulk Synchronous Parallel (BSP) algorithms as functional programs. BSP computing [4] is a parallel programming model introduced by Valiant to offer a high degree of abstraction, like PRAM models. Such algorithms offer portable, predictable and scalable performance on a wide variety of architectures. BSML expresses them with a small set of primitives taken from the confluent BSλ-calculus. These operations are implemented as a parallel library (http://bsmllib.free.fr) for the functional programming language Objective Caml (http://www.ocaml.org). In recent years there has been a trend towards using a set of parallel machines, in particular SMP clusters, for these kinds of problems. Programming this kind of supercomputer is still difficult, and libraries, languages and formal cost models are needed. Computing with these "clusters of clusters" is usually called departmental metacomputing. BSML is not suited to departmental metacomputing, in particular because of the heterogeneous nature of the computing resources and networks. In fact, the BSP model itself is not well suited to these two-tiered architectures. This paper describes our first work on the design of a model and a functional programming language for departmental metacomputing.

Bulk Synchronous Parallelism. A BSP computer contains a set of processor-memory pairs, a communication network allowing inter-processor delivery of messages, and a global synchronization unit which executes collective requests for a synchronization barrier. Its performance is characterized by three parameters expressed as multiples of the local processing speed: p is the number of processor-memory pairs, l is the time required for a global synchronization, and g is the time for collectively delivering a 1-relation (a communication phase where every processor receives or sends at most one word). The network can deliver an h-relation in time g·h for any arity h. A BSP program is executed as a sequence of super-steps, each one divided into three successive, disjoint phases. In the first phase, each processor uses its local data to perform sequential computations and to request data transfers to and from other nodes. In the second phase, the network delivers the requested data, and in the third phase a global synchronization barrier occurs, making the transferred data available for the next super-step. The execution time of a super-step s is thus the sum of the maximal local processing time, the data delivery time and the global synchronization time:

Time(s) = \max_{i:\text{processor}} w_i^{(s)} + \max_{i:\text{processor}} h_i^{(s)} \cdot g + l

where w_i^{(s)} is the local processing time and h_i^{(s)} is the number of words transmitted (or received) by processor i during super-step s. The execution time of a BSP program is therefore the sum of the execution times of its super-steps.
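As a concrete reading of this cost model, the following OCaml sketch (the names bsp_params, superstep_cost and program_cost are illustrative helpers, not part of the BSMLlib API) computes the estimated cost of one super-step from per-processor work and communication volumes, and sums the super-steps of a program:

    (* BSP machine parameters: p processor-memory pairs, g = cost of a
       1-relation, l = cost of a global synchronization barrier, both
       expressed as multiples of the local processing speed. *)
    type bsp_params = { p : int; g : float; l : float }

    (* Cost of one super-step: max_i w_i + (max_i h_i) * g + l, where
       w.(i) is the local work and h.(i) the number of words sent or
       received by processor i during this super-step. *)
    let superstep_cost (bsp : bsp_params) (w : float array) (h : float array) =
      let max_of a = Array.fold_left max 0. a in
      max_of w +. (max_of h *. bsp.g) +. bsp.l

    (* The cost of a whole BSP program is the sum of its super-steps,
       here given as a list of (work, communication) profiles. *)
    let program_cost (bsp : bsp_params) steps =
      List.fold_left (fun acc (w, h) -> acc +. superstep_cost bsp w h) 0. steps

For instance, with g = 2.5 and l = 1000., a super-step with work profile [| 100.; 250.; 80.; 120. |] and communication profile [| 10.; 30.; 5.; 0. |] costs 250. +. 30. *. 2.5 +. 1000. = 1325.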
There are some arguments against using BSP for metacomputing. First, the global synchronization barrier is claimed to be expensive and the BSP model too restrictive. For example, divide-and-conquer parallel algorithms are a class of algorithms which seem difficult to express in the BSP model. This problem can be overcome at the programming-language level without modifying the BSP model. The main problem is that the model does not take into account the different capacities of the parallel machines and of the networks. So bulk synchronous parallelism does not seem suitable for metacomputing.

The global synchronization barrier could be removed. [3] introduces MPM, a model directly inspired by BSP. It proposes to replace the notion of super-step by the notion of m-step, defined as follows: at each m-step, each process performs a sequential computation phase and then a communication phase. During this communication phase the processes exchange the data they need for the next m-step. The model uses the set of "incoming partners" of each process at each m-step. The execution time of a program is thus bounded by its number of m-steps. The MPM model takes into account the fact that a process only synchronizes with its incoming partners and is therefore more accurate (an illustrative sketch is given at the end of this abstract).

The heterogeneous nature of networks has also been investigated. [2] proposes a two-level Hierarchical BSP model for supercomputing, without subset synchronization and with two levels of communication. A BSP2 computer consists of a number of uniform BSP units connected by a communication network. The execution of a BSP2 program proceeds in hyper-steps separated by global synchronizations. In each hyper-step, each BSP unit performs a complete BSP computation (some super-steps) and communicates some data with the other BSP units. However, the authors noted that none of the algorithms they analyzed showed any significant benefit from this approach, and the experiments do not follow the model. The failure of the BSP2 model comes from two main reasons: first, the BSP units are generally different in practice, and second, the time required for the synchronization of all the BSP units is too expensive.

DMM: Departmental Metacomputing Model. To preserve the work made on BSP algorithms and to deal with the different architectures of each parallel...
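The MPM idea above, where each process waits only for its incoming partners instead of a global barrier, can be illustrated with the following OCaml sketch. It is only a simplified reading of point-to-point synchronization; the name mpm_finish_times and the exact cost expression are assumptions made for illustration, not the published cost equations of the MPM model:

    (* Simplified per-process cost estimate in the spirit of the MPM model.
       For m-step s and process i:
         finish.(s).(i) = max of finish.(s-1).(j) over j in {i} plus the
                          incoming partners of i at step s,
                          plus the local work and communication cost of i at s.
       With no incoming partners, a process never waits for the others. *)
    let mpm_finish_times
        ~(work : float array array)         (* work.(s).(i): local work        *)
        ~(comm : float array array)         (* comm.(s).(i): communication     *)
        ~(incoming : int list array array)  (* incoming.(s).(i): partners of i *)
      =
      let nsteps = Array.length work in
      let p = if nsteps = 0 then 0 else Array.length work.(0) in
      let finish = Array.make_matrix nsteps p 0. in
      for s = 0 to nsteps - 1 do
        for i = 0 to p - 1 do
          let ready =
            if s = 0 then 0.
            else
              List.fold_left
                (fun acc j -> max acc finish.(s - 1).(j))
                finish.(s - 1).(i)
                incoming.(s).(i)
          in
          finish.(s).(i) <- ready +. work.(s).(i) +. comm.(s).(i)
        done
      done;
      finish

The estimated completion time of the program is then the largest value at the last m-step, whereas under a global barrier every process of every step would pay for the slowest one.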

Similar resources

A Functional Language for Departmental Metacomputing

We have designed a functional data-parallel language called BSML for programming bulk synchronous parallel (BSP) algorithms. Deadlocks and indeterminism are avoided, and the execution time can then be estimated. For very large scale applications, more than one parallel machine could be needed; one speaks of metacomputing. A major problem in programming applications for such architectures is the...

Standards Based Heterogeneous Metacomputing: The Design of HARNESS II

Emerging trends in heterogeneous distributed metacomputing and in Web Services technologies exhibit several commonalities that each domain can exploit. In this paper, we present an architectural model and design issues in leveraging Web Services to construct metacomputing frameworks. Our design is based on a combination of concepts currently embodied in the Harness system and those implemented ...

High Throughput Computing: Stealing Unused Cycles

High Throughput Computing (HTC) systems enable otherwise idle cycles to be made available to computations that involve many independent tasks. In a distributed computing environment, distributed ownership of computing resources is the major obstacle HTC has to overcome to take advantage of under-utilized systems in the network. This paper addresses HTC and how it fits in current parallel and distri...

Design of an Artificial-Neural-Network-Based

This paper analyzes a serious limitation of the existing metacomputing directory service of the Globus project, namely that it does not support application-oriented queries, and then designs an artificial-neural-network-based GRC (grid resources classifier) to eliminate this limitation. This classifier extends the metacomputing directory service by classifying grid reso...

PVM Emulation in the Harness Metacomputing System: A Plug-in Based Approach

Metacomputing frameworks have received renewed attention of late, fueled both by advances in hardware and networking, and by novel concepts such as computational grids. Harness is an experimental metacomputing system based upon the principle of dynamic reconfigurability not only in terms of the computers and networks that comprise the virtual machine, but also in the capabilities of the VM itse...


Journal title: ICCS 2004, Lecture Notes in Computer Science (M. Bubak et al., Eds.), Springer-Verlag Berlin Heidelberg

Volume 3038

Pages 50–53

Publication date: 2004